Spontaneous speech consolidation for spoken language applications

Authors

Chiori Hori

Alexander H. Waibel

Abstract

This paper describes work done as part of the International Workshop on Speech Summarization for Information Extraction and Machine Translation (IWSpS) on spoken language processing, including summarization, machine translation, and question answering, applied to lecture speech in the Translanguage English Database (TED) corpus. The hypotheses of lecture speech produced by an automatic speech recognition (ASR) system are ill-formed because of speaker spontaneity and recognition errors. Errors introduced by the ASR system affect the overall performance of the spoken language processing components. To obtain more reliable phrases that maintain the original meaning and contribute positively to the total performance of the spoken language system, this paper proposes a consolidation framework. The consolidation approach extracts words by excluding redundant and irrelevant information and concatenates them so as to maintain the original meaning. Automatic consolidation performance is evaluated against manual consolidation by humans using a word accuracy metric. Our approach gives 58% accuracy on ASR output with 70% word accuracy.
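The abstract reports results in terms of word accuracy but does not define the metric. As a rough illustration only, the sketch below assumes the conventional ASR definition, (N - S - D - I) / N, where N is the number of reference words and S, D, and I are the substitutions, deletions, and insertions in a minimum edit-distance alignment. The function name `word_accuracy` and the whitespace tokenization are assumptions made for this example; the exact formulation used by the authors may differ.

```python
def word_accuracy(reference: str, hypothesis: str) -> float:
    """Word accuracy under the conventional (N - S - D - I) / N definition.

    Assumption: words are separated by whitespace, and errors are counted
    with a standard Levenshtein alignment over words.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr

    errors = prev[-1]  # S + D + I for the best alignment
    return (len(ref) - errors) / len(ref)
```

Under this definition, a score can be negative when the hypothesis contains many insertions, which is one reason some evaluations report percent correct alongside it.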

Similar resources

Understanding Spontaneous Speech

When speech understanding systems are used in real applications, they will have to deal with phenomena peculiar to spontaneous speech. People use language differently when they speak than when they write. Spoken language contains many interjections, filled pauses, etc. Speakers often don't use well-formed sentences. They speak in phrases, have restarts, etc. Systems designed for written or read...

Spontaneous speech: how people really talk and why engineers should care

Spontaneous conversation is optimized for human-human communication, but differs in some important ways from the types of speech for which human language technology is often developed. This overview describes four fundamental properties of spontaneous speech that present challenges for spoken language applications because they violate assumptions often applied in automatic processing technology.

A Corpus of Spontaneous Speech in Lectures: The KIT Lecture Corpus for Spoken Language Processing and Translation

With the increasing number of applications handling spontaneous speech, the needs to process spoken languages become stronger. Speech disfluency is one of the most challenging tasks to deal with in automatic speech processing. As most applications are trained with well-formed, written texts, many issues arise when processing spontaneous speech due to its distinctive characteristics. Therefore, ...

Inferring linguistic structure in spoken language

We demonstrate the applications of Markov Chains and HMMs to modeling of the underlying structure in spontaneous spoken language. Experiments with supervised training cover the detection of the current dialog state and identification of the speech act as used by the speech translation component in our JANUS Speech-to-Speech Translation System. HMM training with hidden states is used to uncover o...

The Voyager Speech Understanding System: A Progress Report

As part of the DARPA Spoken Language System program, we recently initiated an effort in spoken language understanding. A spoken language system addresses applications in which speech is used for interactive problem solving between a person and a computer. In these applications, not only must the system convert the speech signal into text, it must also understand the linguistic structure of a se...

Publication date: 2005
